Source Localization for Dual Speech Enhancement Technology

Authors

  • Seungil Kim
  • Hyejeong Jeon
Abstract

Many researchers have investigated multi-channel speech enhancement techniques that can be used as pre-processing for speech recognition systems. Using many microphones can give high performance, but it adds hardware cost and complicates the design because of microphone placement. Speech enhancement techniques that use two microphones are therefore preferred in mobile phones such as the LG KM900, iPhone 4, and Nexus One. To enhance speech with two or more microphones, the spatial information carried by the incident angle of the input signal should be used. Various sound source localization (SSL) methods have therefore been used to estimate the talker's direction of arrival (DOA). There are two main approaches to localization (Brandstein, 1995), (DiBiase, 2000): the steered-beamformer approach, which includes various kinds of beamformers, and the time-difference-of-arrival (TDOA) approach, which includes generalized cross-correlation (GCC). The steered-beamformer approach can enhance a desired signal that originates from a particular direction. The beamformer steers its response toward a particular angle; by scanning over a predefined spatial region, it can then find the direction that maximizes the beamformer output. For this purpose, we can use a simple conventional delay-and-sum beamformer or one of many optimum beamformers (Naguib, 1996). The TDOA approach uses classical time delay estimation techniques, such as cross-correlation, GCC, adaptive time delay estimation, and the adaptive eigenvalue decomposition algorithm (Chen et al., 2006). The most common time delay estimation method is GCC, which comes in several types, such as the unfiltered type, the maximum likelihood (ML) type, and the phase transform (PHAT) type. GCC-PHAT is a widely used TDOA estimation method because it works well in realistic environments.
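
As a rough illustration of the GCC-PHAT estimator mentioned above, the following Python sketch estimates the delay between two channels. It is not taken from the paper: the function name `gcc_phat`, its parameters, the use of NumPy, and the small constant that guards the division are all assumptions made for this example.

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """Return the estimated delay (seconds) of x2 relative to x1.

    A positive result means x2 lags x1. Hypothetical helper, for illustration only.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-power spectrum; this ordering puts the peak at a positive lag when x2 lags x1.
    cross = X2 * np.conj(X1)

    # PHAT weighting: keep only the phase (1e-12 avoids division by zero).
    cross = cross / (np.abs(cross) + 1e-12)

    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.roll(cc, max_shift)[: 2 * max_shift + 1]
    lag = int(np.argmax(cc)) - max_shift
    return lag / float(fs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = rng.standard_normal(4096)                 # synthetic broadband source
    delayed = np.concatenate((np.zeros(5), s[:-5]))  # second channel lags by 5 samples
    print(gcc_phat(s, delayed, fs=16000) * 16000)
```

The sketch works on a single frame of synthetic noise; a practical estimator would run frame by frame on real recordings.
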
The resolution of a DOA estimator is closely related to the aperture size of the array and the number of microphones. A large aperture and a large number of microphones yield accurate estimates. An SSL method that uses only two microphones therefore cannot give an accurate direction-of-arrival (DOA) estimate. Moreover, implementing a TDOA estimator requires a voice activity detector (Araki et al., 2007) or a speech/non-speech detector (Lathoud, 2006). Even with this kind of additional processing, however, TDOA estimation often fails. Hence, a reliable SSL algorithm is needed for dual-channel speech enhancement systems.
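
One way to see why a small two-microphone aperture limits DOA resolution is to list the angles that whole-sample delays can represent. The sketch below assumes a far-field plane-wave model, sin(theta) = c * tau / d, and integer-sample lags; neither assumption is stated in the abstract. The function name `resolvable_angles`, the speed of sound of 343 m/s, and the example spacings are hypothetical.

```python
import numpy as np

def resolvable_angles(mic_distance, fs, c=343.0):
    """Angles (degrees) reachable with integer-sample delays for a two-microphone pair.

    Assumes a far-field source, so that sin(theta) = c * tau / mic_distance.
    Hypothetical helper, for illustration only.
    """
    # Largest whole-sample lag that is still physically possible.
    max_lag = int(np.floor(mic_distance * fs / c))
    lags = np.arange(-max_lag, max_lag + 1)

    # Map each lag to an angle; clip guards against rounding just past +/-1.
    sin_theta = np.clip(lags * c / (mic_distance * fs), -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))


if __name__ == "__main__":
    print(len(resolvable_angles(mic_distance=0.10, fs=16000)))  # 10 cm spacing
    print(len(resolvable_angles(mic_distance=0.02, fs=16000)))  # 2 cm spacing
```

With the hypothetical values above (16 kHz sampling), a 10 cm spacing leaves only nine whole-sample lags within the physical delay range, so such an estimator can return only nine distinct angles. At 2 cm spacing only the zero lag remains.
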


Related articles

A New Shuffled Sub-swarm Particle Swarm Optimization Algorithm for Speech Enhancement

In this paper, we propose a novel algorithm to enhance noisy speech in the framework of dual-channel speech enhancement. The new method is a hybrid optimization algorithm that combines the conventional θ-PSO and the shuffled sub-swarms particle optimization (SSPSO) technique. It is known that the θ-PSO algorithm has better optimization performance than standard PSO al...


A Microphone Array System for Speech Source Localization, Denoising, and Dereverberation

There is a great deal of potential for advancement in distant-talker speech acquisition research, and a wealth of current and future technology depends upon these advances. The goal of this work is to allow users the opportunity to roam unfettered in diverse environments while still providing a high quality speech signal and a robustness to background noise and reverberation effects. In this th...


A two-microphone dual delay-line approach for extraction of a speech sound in the presence of multiple interferers.

This paper describes algorithms for signal extraction for use as a front-end of telecommunication devices, speech recognition systems, as well as hearing aids that operate in noisy environments. The development was based on some independent, hypothesized theories of the computational mechanics of biological systems in which directional hearing is enabled mainly by binaural processing of interau...


Direction of Arrival Estimation and Localization of Multiple Speech Sources in Enclosed Environments

Speech communication is gaining in popularity in many different contexts as technology evolves. With the introduction of mobile electronic devices such as cell phones and laptops, and fixed electronic devices such as video and teleconferencing systems, more people are communicating, which leads to an increasing demand for new services and better speech quality. Methods to enhance speech recorded...


Speech Enhancement by Modified Convex Combination of Fractional Adaptive Filtering

This paper presents new adaptive filtering techniques for use in speech enhancement systems. Adaptive filtering schemes are subject to different trade-offs regarding their steady-state misadjustment, speed of convergence, and tracking performance. Fractional Least-Mean-Square (FLMS) is a new adaptive algorithm that performs better than the conventional LMS algorithm. Normalization of LMS ...



Publication date: 2012